A syllable based continuous speech recognizer for Tamil

Authors

  • A. Lakshmi
  • Hema A. Murthy
Abstract

This paper presents a novel technique for building a syllable-based continuous speech recognizer when only unannotated, transcribed training data is available. We present two segmentation algorithms that segment the speech and the corresponding text into comparable syllable-like units. A group-delay-based two-level segmentation algorithm is proposed to extract accurate syllable units from the speech data. A rule-based text segmentation algorithm is used to automatically annotate the text corresponding to the speech into syllable units. Isolated-style syllable models are built using multiple frame size (MFS) and multiple frame rate (MFR) for all unique syllables by collecting examples from the annotated speech. Experiments performed on the Tamil language show that the recognition performance is comparable to that of recognizers built using manually segmented training data. These experiments suggest that system development cost can be reduced, with minimal manual effort, if sentence-level transcription of the speech data is available.
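
The abstract does not say how the multiple frame size (MFS) and multiple frame rate (MFR) analysis is carried out. As a rough illustration only, the sketch below assumes it means framing the same syllable segment with several window lengths and several frame shifts before feature extraction. The function names `frame_signal` and `multi_resolution_frames` and the example window/shift values are hypothetical and not taken from the paper.

```python
def frame_signal(samples, frame_size, frame_shift):
    """Split a sequence of samples into frames of `frame_size` samples,
    advancing the window by `frame_shift` samples each step.
    Trailing samples that do not fill a whole frame are dropped."""
    if frame_size <= 0 or frame_shift <= 0:
        raise ValueError("frame_size and frame_shift must be positive")
    frames = []
    start = 0
    while start + frame_size <= len(samples):
        frames.append(list(samples[start:start + frame_size]))
        start += frame_shift
    return frames


def multi_resolution_frames(samples, configs):
    """Frame the same segment once per (frame_size, frame_shift) pair.
    Varying frame_size is the assumed MFS part; varying frame_shift is
    the assumed MFR part. Returns a dict keyed by the config tuple."""
    return {
        (size, shift): frame_signal(samples, size, shift)
        for size, shift in configs
    }


if __name__ == "__main__":
    # Hypothetical toy segment; a real one would hold audio samples.
    segment = list(range(10))
    views = multi_resolution_frames(segment, [(4, 2), (4, 3), (5, 5)])
    for config, frames in views.items():
        print(config, len(frames))
```

Each entry in the returned dictionary is one view of the same syllable example; a downstream feature extractor could be run on each view separately, which is one plausible way to obtain several training examples per syllable from a single utterance.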

Similar resources

Word and Triphone Based Approaches in Continuous Speech Recognition for Tamil Language

Building a continuous speech recognizer for an Indian language like Tamil is a challenging task due to the unique inherent features of the language, such as long and short vowels, lack of aspirated stops, aspirated consonants and many instances of allophones. Stress and accent vary in spoken Tamil from region to region, but in formal read Tamil speech, stress and accents are ignored. Ther...

Development of Syllable Based Unit Selection Text-To-Speech Synthesis System for Tamil Using Three Level Fall Back Technique

A text-to-speech synthesis system is one that is capable of producing intelligible and natural speech corresponding to any given text. A popular approach to speech synthesis is unit selection synthesis (USS). The current work focuses on developing a USS system for Tamil. Literature suggests that syllable is a suitable unit for Indian languages. Creating a database that covers all the syllables ...

Design of language models at various phases of Tamil speech recognition system

This paper describes the use of language models in various phases of a Tamil speech recognition system to improve its performance. In this work, the language models are applied at various levels of speech recognition, such as the segmentation phase, the recognition phase, and the syllable- and word-level error correction phase. The speech signals were segmented at phonetic level based on their acoustic c...

Publication date: 2006